Vector Autoregression

Week 12

Evie Zhang

Old Dominion University

Topics

  • Vector Autoregression

Autoregression

\[Y_t = f(Y_{t-1}, ..., Y_{t-p})\]

  • AR(p) models suggest the history of some variable will teach us about its future.
  • We use adf.test() to test for stationarity.
  • We use acf() and pacf() to examine the lag structure.
  • auto.arima is our friend.
  • But what about other factors?

Distributed Lag Models

\[\begin{align}Y_t = f(&Y_{t-1}, ..., Y_{t-p},\\ &X_{t-1}, ..., X_{t-q})\end{align}\]

  • ARDL(p, q) models suggest the history of another variable will teach us about our variable’s future.
  • We still need stationarity and a lag structure.
  • To forecast with this, we often need multiple equations/models.

Vector Autoregression

Chris Sims’s Macroeconomics and Reality (1980) was hugely influential in empirical macroeconomics.

  • Previously, most people would use ARDL type models.
  • The issue with these models is that they do not allow for “feedback”.

Vector Autoregression

Motivation:

\[Y_{t} = \alpha_{11} Y_{t-1} + \beta_{11} X_{t-1} + e_{1t}\]

  • \(X\) “causes” \(Y\), but \(Y\) cannot influence \(X\).

Vector Autoregression

Solution:

\[Y_{t} = \alpha_{1} Y_{t-1} + \beta_{1} X_{t-1} + e_{1t}\] \[X_{t} = \alpha_{2} Y_{t-1} + \beta_{2} X_{t-1} + e_{2t}\]

Now we can model \(X\) and \(Y\) as functions of past values of one another.

This will allow for this “feedback”.

Vector Autoregression

Of course, errors to this system might be correlated. Therefore, we must decompose them into independent shocks.

\[\begin{align} e_{1t} &= u_{1t} \\ e_{2t} &= \rho e_{1t} + u_{2t} \\ &= \rho u_{1t} + u_{2t} \end{align}\]

Vector Autoregression

Solution:

\[\begin{align}Y_{t} &= \alpha_{1} Y_{t-1} + \beta_{1} X_{t-1} + u_{1t}\\ X_{t} &= \alpha_{2} Y_{t-1} + \beta_{2} X_{t-1} + \rho u_{1t} + u_{2t}\end{align}\]

Important: this ordering has implications on your model. Here, shocks to \(Y\) impact both \(Y\) and \(X\) at the same time. However, a shock to \(X\) only gets to \(Y\) next period through the lagged \(X\).

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

VAR() is the main function

  • y: the dataset (a ts object)
  • p: number of lags
  • lag.max: determines the highest lag when auto-selecting
  • type: “none”, “const”, “trend”, “both”
  • season: integer telling the frequency
  • exogen: exogenous variables (no feedback)

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

restrict() is a function to estimate restricted VARs

  • x: a fitted object from VAR()
  • method: “ser” (automatic based on significance) or “manual”

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

VARselect() is a function to help select lag length

  • Use the same things as in VAR().

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

irf() is a function to help you interpret VAR output. There are many coefficients that are very difficult to interpret, so irf() will guide your interpretation.

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

SVAR() is a function to estimate a structural VAR. This is ultimately beyond the scope of the course, but used in more “serious” macroeconomics.

VAR Simulation

Code
N <- 200
y <- rep(0, N)
x <- rep(0, N)
shocksy <- rnorm(N, sd = .2)
shocksx <- .5*shocksy + rnorm(N, sd = .2)
for(i in 3:N){
  
  y[i] <- .4*y[i-1] - .2*y[i-2] - .3*x[i-1] + .6*x[i-2] + shocksy[i]
  x[i] <- .5*y[i-1] - .5*y[i-2] + .5*x[i-1] + .4*x[i-2] + shocksx[i]
}

sim <- ts(cbind(y, x), start = 1, frequency = 1)
head(sim)
Time Series:
Start = 1 
End = 6 
Frequency = 1 
             y           x
1  0.000000000  0.00000000
2  0.000000000  0.00000000
3 -0.055596983  0.08129699
4 -0.070557695  0.03195294
5 -0.002703492 -0.10851724
6  0.118393831  0.10223881

VAR Simulation

Code
par(mfrow = c(1, 2))
plot(y, type = "l"); plot(x, type = "l")

VAR Simulation

Code
tseries::adf.test(sim[,1])
tseries::adf.test(sim[,2])

    Augmented Dickey-Fuller Test

data:  sim[, 1]
Dickey-Fuller = -4.6124, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary


    Augmented Dickey-Fuller Test

data:  sim[, 2]
Dickey-Fuller = -3.8952, Lag order = 5, p-value = 0.01553
alternative hypothesis: stationary

VAR Simulation

Code
acf(sim)

Code
pacf(sim)

VAR Simulation

Code
lagz <- VARselect(sim)
lagz$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     2      2      2      2 

VAR Simulation

Code
# y[i] <- .4*y[i-1] - .2*y[i-2] - .3*x[i-1] + .6*x[i-2] + shocksy[i]
# x[i] <- .5*y[i-1] - .5*y[i-2] + .5*x[i-1] + .4*x[i-2] + shocksx[i]
reg <- VAR(sim, p = 2, season = NULL, type = "none", exogen = NULL); reg

VAR Estimation Results:
======================= 

Estimated coefficients for equation y: 
====================================== 
Call:
y = y.l1 + x.l1 + y.l2 + x.l2 

      y.l1       x.l1       y.l2       x.l2 
 0.4222966 -0.3021403 -0.2895598  0.5779540 


Estimated coefficients for equation x: 
====================================== 
Call:
x = y.l1 + x.l1 + y.l2 + x.l2 

      y.l1       x.l1       y.l2       x.l2 
 0.4275707  0.4589329 -0.3075500  0.3607856 

VAR Simulation

Code
impulse <- irf(reg, impulse = "y", response = "y")
plot(impulse)

Code
impulse <- irf(reg, impulse = "y", response = "x")
plot(impulse)

Code
impulse <- irf(reg, impulse = "x", response = "y")
plot(impulse)

Code
impulse <- irf(reg, impulse = "x", response = "x")
plot(impulse)

VAR Simulation

Variance Decomposition: % of variance due to the other variable

Code
plot(fevd(reg))

VAR Simulation

Code
pred <- predict(reg, n.ahead = 20)
plot(y, type = "l", xlim = c(1, N + 20))
lines((N+1):(N+20), pred$fcst$y[,1], col = "tomato")

Code
pred <- predict(reg, n.ahead = 20)
plot(x, type = "l", xlim = c(1, N + 20))
lines((N+1):(N+20), pred$fcst$x[,1], col = "tomato")
lines((N+1):(N+20), pred$fcst$x[,2], col = "tomato", lty = 2)
lines((N+1):(N+20), pred$fcst$x[,3], col = "tomato", lty = 2)

Code
pred <- predict(reg, n.ahead = 20)
fanchart(pred, names = "x")

Los Angeles Air Quality

LA Air Quality

There is an R package astsa (Applied Statistical Time Series Analysis) that accompanies two textbooks (Time Series Analysis and Its Applications: With R Examples; Time Series: A Data Analysis Approach using R). We are going to use some data from this package.

These data are at the weekly level from Los Angeles County from 1970 - 1979

  • Average weekly cardiovascular mortality
  • Temperatures
  • Air Quality (pollution particulates)
Code
# Average weekly cardiovascular mortality in Los Angeles County
# 508 six-day smoothed averages: 1970-1979
cmort <- astsa::cmort

# Temperatures from LA pollution study
tempr <- astsa::tempr

# Air Quality (particulates) from LA pollution study
part <- astsa::part

LA Air Quality

Code
plot(cmort)

Code
plot(tempr)

Code
plot(part)

LA Air Quality

Code
tseries::adf.test(cmort)

    Augmented Dickey-Fuller Test

data:  cmort
Dickey-Fuller = -5.4125, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary
Code
tseries::adf.test(tempr)

    Augmented Dickey-Fuller Test

data:  tempr
Dickey-Fuller = -4.4572, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary
Code
tseries::adf.test(part)

    Augmented Dickey-Fuller Test

data:  part
Dickey-Fuller = -4.493, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

LA Air Quality

Code
la <- ts.union(cmort = cmort,
               tempr = tempr,
               part = part)

Order Assumptions:

  • A shock to mortality impacts mortality, temperature, and particles
  • A shock to temperature impacts temperature and particles
  • A shock to particles impacts particles

LA Air Quality

Code
la <- ts.union(part = part,
               tempr = tempr,
               cmort = cmort)

Order Assumptions:

  • A shock to particles impacts particles, temperature, and mortality
  • A shock to temperature impacts temperature and mortality
  • A shock to mortality impacts mortality

LA Air Quality

Code
acf(la, lag.max = 52)

Code
pacf(la, lag.max = 52)

LA Air Quality

Code
lagz <- VARselect(la,          # dataset
                  lag.max = 26,# half-year lags as a max
                  season = 52) # controlling for seasonality via week FEs
lagz$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     5      2      2      5 

LA Air Quality

Code
reg <- VAR(la, p=5, type="both", season = 52)
summary(reg)

VAR Estimation Results:
========================= 
Endogenous variables: part, tempr, cmort 
Deterministic variables: both 
Sample size: 503 
Log Likelihood: -4687.423 
Roots of the characteristic polynomial:
0.8598 0.7655 0.6828 0.6828 0.663 0.663 0.6413 0.6281 0.6281 0.6194 0.6194 0.5488 0.5488 0.5222 0.5222
Call:
VAR(y = la, p = 5, type = "both", season = 52L)


Estimation results for equation part: 
===================================== 
part = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1   0.049295   0.065131   0.757 0.449538    
tempr.l1 -0.208162   0.113626  -1.832 0.067637 .  
cmort.l1  0.123986   0.100487   1.234 0.217923    
part.l2   0.144762   0.063992   2.262 0.024179 *  
tempr.l2 -0.001549   0.114346  -0.014 0.989198    
cmort.l2 -0.102222   0.105310  -0.971 0.332252    
part.l3   0.179490   0.064698   2.774 0.005770 ** 
tempr.l3 -0.265257   0.115915  -2.288 0.022594 *  
cmort.l3  0.024809   0.110437   0.225 0.822361    
part.l4   0.239045   0.064091   3.730 0.000217 ***
tempr.l4 -0.312321   0.114887  -2.719 0.006820 ** 
cmort.l4 -0.035780   0.104821  -0.341 0.733011    
part.l5   0.023776   0.064868   0.367 0.714149    
tempr.l5  0.085251   0.115150   0.740 0.459489    
cmort.l5 -0.147366   0.099896  -1.475 0.140883    
const    82.790530  17.824775   4.645 4.52e-06 ***
trend    -0.005258   0.004783  -1.099 0.272252    
sd1       2.362927   4.876084   0.485 0.628207    
sd2      -2.306754   4.870991  -0.474 0.636044    
sd3      -3.060327   4.991713  -0.613 0.540143    
sd4      -5.864355   4.950999  -1.184 0.236870    
sd5       1.915231   5.015943   0.382 0.702775    
sd6      -4.620982   4.986433  -0.927 0.354591    
sd7      -7.212084   5.121048  -1.408 0.159750    
sd8      -3.713790   5.300526  -0.701 0.483899    
sd9      -5.748112   5.339170  -1.077 0.282259    
sd10     -5.039454   5.425063  -0.929 0.353445    
sd11      2.470759   5.572720   0.443 0.657721    
sd12     -3.064717   5.734745  -0.534 0.593330    
sd13      2.305934   5.780915   0.399 0.690172    
sd14     -2.821275   5.814388  -0.485 0.627762    
sd15     -0.436539   5.937560  -0.074 0.941425    
sd16     -3.368925   6.053437  -0.557 0.578134    
sd17      0.404534   6.225082   0.065 0.948216    
sd18      5.816921   6.293288   0.924 0.355840    
sd19      5.508341   6.539902   0.842 0.400102    
sd20     11.960828   6.575750   1.819 0.069610 .  
sd21      6.443608   6.734657   0.957 0.339208    
sd22      9.174171   6.825427   1.344 0.179611    
sd23      6.979634   6.952467   1.004 0.315981    
sd24     16.166939   7.027334   2.301 0.021888 *  
sd25     13.393565   7.063550   1.896 0.058602 .  
sd26      8.787142   7.145392   1.230 0.219450    
sd27      9.982003   7.149967   1.396 0.163399    
sd28      9.154440   7.164670   1.278 0.202030    
sd29     11.900858   7.159697   1.662 0.097193 .  
sd30     12.140906   7.068455   1.718 0.086578 .  
sd31     12.455307   7.074839   1.761 0.079024 .  
sd32     14.969761   6.931012   2.160 0.031332 *  
sd33     14.121541   6.885372   2.051 0.040871 *  
sd34     15.636215   6.831606   2.289 0.022569 *  
sd35     18.742345   6.692165   2.801 0.005327 ** 
sd36     22.713167   6.468197   3.512 0.000492 ***
sd37     12.432550   6.363695   1.954 0.051381 .  
sd38     20.203314   6.243914   3.236 0.001306 ** 
sd39     17.607029   5.920029   2.974 0.003102 ** 
sd40     14.750642   5.781377   2.551 0.011070 *  
sd41     13.700734   5.637624   2.430 0.015493 *  
sd42     10.356809   5.565891   1.861 0.063452 .  
sd43     17.584821   5.310093   3.312 0.001005 ** 
sd44     10.363581   5.402208   1.918 0.055715 .  
sd45     11.652398   5.159027   2.259 0.024400 *  
sd46      6.454610   5.179864   1.246 0.213400    
sd47      5.557025   5.208122   1.067 0.286567    
sd48      3.248403   5.172188   0.628 0.530299    
sd49      7.915153   5.226536   1.514 0.130646    
sd50      4.050254   5.123964   0.790 0.429694    
sd51      6.244605   4.926960   1.267 0.205678    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 9.951 on 435 degrees of freedom
Multiple R-Squared: 0.6256, Adjusted R-squared: 0.568 
F-statistic: 10.85 on 67 and 435 DF,  p-value: < 2.2e-16 


Estimation results for equation tempr: 
====================================== 
tempr = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1  -0.001671   0.037279  -0.045 0.964260    
tempr.l1  0.041465   0.065037   0.638 0.524090    
cmort.l1 -0.055240   0.057516  -0.960 0.337373    
part.l2  -0.033307   0.036628  -0.909 0.363676    
tempr.l2  0.162687   0.065449   2.486 0.013304 *  
cmort.l2 -0.035054   0.060277  -0.582 0.561168    
part.l3   0.041131   0.037031   1.111 0.267306    
tempr.l3 -0.093019   0.066347  -1.402 0.161623    
cmort.l3  0.051724   0.063211   0.818 0.413647    
part.l4   0.072928   0.036684   1.988 0.047437 *  
tempr.l4  0.013721   0.065758   0.209 0.834810    
cmort.l4  0.020643   0.059997   0.344 0.730961    
part.l5  -0.063262   0.037129  -1.704 0.089124 .  
tempr.l5  0.101658   0.065909   1.542 0.123703    
cmort.l5 -0.101582   0.057178  -1.777 0.076333 .  
const    67.569622  10.202446   6.623 1.04e-10 ***
trend    -0.001629   0.002738  -0.595 0.552077    
sd1      -1.095339   2.790946  -0.392 0.694909    
sd2      -0.252820   2.788031  -0.091 0.927788    
sd3      -0.554623   2.857129  -0.194 0.846173    
sd4      -4.655073   2.833825  -1.643 0.101171    
sd5       0.503427   2.870998   0.175 0.860887    
sd6      -0.869841   2.854107  -0.305 0.760689    
sd7       0.990936   2.931157   0.338 0.735474    
sd8       1.071027   3.033885   0.353 0.724243    
sd9      -2.118620   3.056004  -0.693 0.488513    
sd10     -0.276556   3.105167  -0.089 0.929073    
sd11      3.948477   3.189683   1.238 0.216424    
sd12      0.622946   3.282422   0.190 0.849568    
sd13      3.535714   3.308848   1.069 0.285859    
sd14      3.236139   3.328007   0.972 0.331394    
sd15      4.730365   3.398508   1.392 0.164665    
sd16      4.302211   3.464833   1.242 0.215024    
sd17      4.348371   3.563078   1.220 0.222975    
sd18      8.533858   3.602117   2.369 0.018266 *  
sd19      9.927234   3.743273   2.652 0.008294 ** 
sd20     12.842438   3.763791   3.412 0.000705 ***
sd21      9.868539   3.854746   2.560 0.010801 *  
sd22     10.639028   3.906700   2.723 0.006724 ** 
sd23      9.644718   3.979414   2.424 0.015772 *  
sd24     13.108459   4.022266   3.259 0.001206 ** 
sd25     12.884889   4.042996   3.187 0.001541 ** 
sd26     10.590151   4.089840   2.589 0.009937 ** 
sd27      8.979146   4.092458   2.194 0.028758 *  
sd28      8.316349   4.100874   2.028 0.043175 *  
sd29      9.496886   4.098027   2.317 0.020944 *  
sd30     10.888702   4.045803   2.691 0.007391 ** 
sd31      6.914916   4.049457   1.708 0.088421 .  
sd32      8.385632   3.967134   2.114 0.035103 *  
sd33     10.177830   3.941011   2.583 0.010133 *  
sd34      7.287770   3.910237   1.864 0.063028 .  
sd35      5.038358   3.830424   1.315 0.189084    
sd36      4.984268   3.702231   1.346 0.178911    
sd37      1.531341   3.642416   0.420 0.674387    
sd38      1.457768   3.573857   0.408 0.683549    
sd39      3.477229   3.388473   1.026 0.305371    
sd40     -1.924273   3.309112  -0.582 0.561200    
sd41     -1.304773   3.226832  -0.404 0.686153    
sd42     -1.972773   3.185774  -0.619 0.536079    
sd43     -1.775810   3.039361  -0.584 0.559341    
sd44     -4.662356   3.092086  -1.508 0.132323    
sd45     -1.696974   2.952895  -0.575 0.565804    
sd46     -4.672863   2.964822  -1.576 0.115729    
sd47     -9.498172   2.980996  -3.186 0.001545 ** 
sd48     -0.831837   2.960428  -0.281 0.778855    
sd49      0.227368   2.991535   0.076 0.939451    
sd50     -0.917113   2.932826  -0.313 0.754654    
sd51     -2.938001   2.820066  -1.042 0.298074    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 5.696 on 435 degrees of freedom
Multiple R-Squared: 0.6558, Adjusted R-squared: 0.6028 
F-statistic: 12.37 on 67 and 435 DF,  p-value: < 2.2e-16 


Estimation results for equation cmort: 
====================================== 
cmort = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1  -0.036619   0.032115  -1.140  0.25482    
tempr.l1 -0.142525   0.056028  -2.544  0.01131 *  
cmort.l1  0.284268   0.049549   5.737 1.81e-08 ***
part.l2  -0.021725   0.031554  -0.688  0.49151    
tempr.l2 -0.059241   0.056383  -1.051  0.29399    
cmort.l2  0.333741   0.051928   6.427 3.42e-10 ***
part.l3  -0.054089   0.031902  -1.695  0.09070 .  
tempr.l3  0.030826   0.057157   0.539  0.58994    
cmort.l3  0.038001   0.054456   0.698  0.48565    
part.l4   0.061244   0.031603   1.938  0.05328 .  
tempr.l4  0.033040   0.056650   0.583  0.56004    
cmort.l4 -0.019117   0.051687  -0.370  0.71166    
part.l5   0.054080   0.031986   1.691  0.09160 .  
tempr.l5 -0.092971   0.056780  -1.637  0.10227    
cmort.l5 -0.041721   0.049258  -0.847  0.39747    
const    55.993042   8.789274   6.371 4.80e-10 ***
trend    -0.011889   0.002358  -5.041 6.81e-07 ***
sd1      -4.582957   2.404363  -1.906  0.05730 .  
sd2      -2.947629   2.401852  -1.227  0.22040    
sd3      -6.578345   2.461380  -2.673  0.00781 ** 
sd4      -4.769697   2.441304  -1.954  0.05137 .  
sd5      -4.018792   2.473327  -1.625  0.10492    
sd6      -6.467114   2.458776  -2.630  0.00884 ** 
sd7      -5.489959   2.525153  -2.174  0.03024 *  
sd8      -4.609526   2.613653  -1.764  0.07850 .  
sd9      -6.693363   2.632708  -2.542  0.01136 *  
sd10     -4.795991   2.675061  -1.793  0.07369 .  
sd11     -6.226300   2.747870  -2.266  0.02395 *  
sd12     -7.650352   2.827763  -2.705  0.00709 ** 
sd13     -4.407427   2.850530  -1.546  0.12279    
sd14     -8.322352   2.867035  -2.903  0.00389 ** 
sd15     -2.348161   2.927770  -0.802  0.42297    
sd16     -7.139528   2.984908  -2.392  0.01719 *  
sd17     -8.374923   3.069545  -2.728  0.00662 ** 
sd18     -7.096146   3.103177  -2.287  0.02269 *  
sd19     -3.315950   3.224781  -1.028  0.30439    
sd20     -0.769079   3.242457  -0.237  0.81262    
sd21     -5.635944   3.320813  -1.697  0.09038 .  
sd22     -5.116042   3.365571  -1.520  0.12921    
sd23     -4.586438   3.428214  -1.338  0.18164    
sd24     -5.490564   3.465130  -1.585  0.11380    
sd25      0.449638   3.482988   0.129  0.89734    
sd26     -4.547571   3.523343  -1.291  0.19749    
sd27     -5.645508   3.525599  -1.601  0.11004    
sd28     -6.081322   3.532850  -1.721  0.08590 .  
sd29     -4.730239   3.530397  -1.340  0.18099    
sd30     -2.384915   3.485407  -0.684  0.49418    
sd31     -3.381577   3.488555  -0.969  0.33292    
sd32     -2.232439   3.417634  -0.653  0.51396    
sd33     -3.286779   3.395130  -0.968  0.33354    
sd34     -1.362172   3.368618  -0.404  0.68614    
sd35     -5.115101   3.299861  -1.550  0.12185    
sd36     -2.747183   3.189423  -0.861  0.38952    
sd37     -2.287542   3.137894  -0.729  0.46639    
sd38      0.843316   3.078831   0.274  0.78429    
sd39     -2.820894   2.919126  -0.966  0.33441    
sd40     -1.032914   2.850757  -0.362  0.71728    
sd41     -3.007494   2.779874  -1.082  0.27990    
sd42     -2.193376   2.744503  -0.799  0.42462    
sd43      0.099277   2.618370   0.038  0.96977    
sd44     -0.991298   2.663792  -0.372  0.70997    
sd45      0.875592   2.543881   0.344  0.73087    
sd46      7.240833   2.554155   2.835  0.00480 ** 
sd47      6.080738   2.568089   2.368  0.01833 *  
sd48     -3.756551   2.550370  -1.473  0.14149    
sd49     -2.657159   2.577169  -1.031  0.30310    
sd50     -2.552333   2.526592  -1.010  0.31297    
sd51     -1.382398   2.429450  -0.569  0.56964    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 4.907 on 435 degrees of freedom
Multiple R-Squared: 0.7913, Adjusted R-squared: 0.7591 
F-statistic: 24.61 on 67 and 435 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
       part tempr cmort
part  99.03 38.46 11.99
tempr 38.46 32.44  7.23
cmort 11.99  7.23 24.08

Correlation matrix of residuals:
        part  tempr  cmort
part  1.0000 0.6786 0.2455
tempr 0.6786 1.0000 0.2587
cmort 0.2455 0.2587 1.0000

LA Air Quality

Plot these in RStudio:

Code
impulse <- irf(reg)
plot(impulse)

Make predictions:

Code
H <- 26
p <- predict(reg, n.ahead = H)

xlimz <- c(min(time(cmort)), max(time(cmort)) + (H/52))
xz <- seq(max(time(cmort)) + (1/52), max(time(cmort)) + (H/52), 1/52)

LA Air Quality

Code
plot(cmort, xlim = xlimz)
lines(xz, p$fcst$cmort[,1], col = "tomato")

Code
plot(tempr, xlim = xlimz)
lines(xz, p$fcst$tempr[,1], col = "tomato")

Code
plot(part, xlim = xlimz)
lines(xz, p$fcst$part[,1], col = "tomato")

Coefficient Dynamics

Coefficient Dynamics

Thus far, we have assumed that the relationship between \(Y\) and \(X\) remains constant over time.

However, this is not always the case.

  • \(\text{Weight} = f(\text{Height})\) (Link)
  • \(\text{Unemployment} = f(\text{Inflation})\) (Phillips Curve)

Phillips Curve

Phillips Curve

Coefficient Dynamics

At a high level, there are a few ways to test, and account, for structural breaks.

  • Chow Break Test (strucchange)
  • Rolling Regression (rollRegres, note the one s)

Of course, you might want to identify breaks, but you might also want to rule them out and demonstrate the stability of your model. Both of these can help with this as well.

Chow Break Test

\[\frac{\Big(\frac{RSS_p - (RSS_1 + RSS_2)}{k}\Big)}{\Big(\frac{RSS_1 + RSS_2}{N_1 + N_2 - 2k}\Big)}\]

  • \(RSS_p\): Residual Sum of Squares for the entire (pooled) regression
  • \(RSS_1\): Residual Sum of Squares for the first section of data
  • \(RSS_2\): Residual Sum of Squares for the second section of data
  • \(k\): Number of Regressors

Chow Break Test

Code
N <- 200
shocks <- rnorm(N)
x <- rnorm(N)
t1 <- 1:N
t2 <- c(rep(0, .4*N), 1:(.6*N))

y <- 3*x + t1 - t2 + shocks

plot(t1, y, type = "l")

Chow Break Test

Testing for a break:

Code
strucchange::sctest(y ~ x, type = "Chow", point = 80)

    Chow test

data:  y ~ x
F = 171.05, p-value < 2.2e-16

But what if you don’t know where exactly the break happens?

Chow Break Test

Code
v <- rep(0, 180-20)
for(i in 20:180){
  
  v[i-19] <- strucchange::sctest(y ~ x, type = "Chow", point = i)$statistic
}

Chow Break Test

Code
plot(20:180, v, type = "b", pch = 19)

Chow Break Test

So what?

  • This can help you choose your analysis sample.
  • This can inform you of breaks you might want to model.
    • Remember: break models, trend models, kink models, etc.

Rolling Regression

Rolling regression is another tool to help you diagnose breaks or general dynamics in your model.

There are two ways you might use rolling regression:

  • Shifting Window
  • Expanding Window

Rolling Regression

Code
# shifiting window:
#r1 <- rollRegres::roll_regres(y ~ x + t1, width = 25L, do_downdates = TRUE)
# expanding window:
#r2 <- rollRegres::roll_regres(y ~ x + t1, width = 25L, do_downdates = FALSE)
#par(mfrow = c(1, 2))
#plot(r1$coefs[,"t1"], type = "b", pch = 19, main = "Shifting Window", ylim = c(-.1, 1.1))
#plot(r2$coefs[,"t1"], type = "b", pch = 19, main = "Expanding Window", ylim = c(-.1, 1.1))

Next Classes

Friday:

Please Submit Project Data!!

11/23:

  • No Class

11/30:

  • Forecast competition
  • Project Help

12/7:

  • Final Presentations